osctrl-frontend: React admin SPA at frontend/ (round 3 of 3) #815

Open
alvarofraguas wants to merge 4 commits into jmpsec:main from alvarofraguas:pr/round-3-frontend

@alvarofraguas

Summary

Round 3 of 3. Lands the React + TypeScript + Vite SPA under a new frontend/ directory at the repo root. The SPA fully covers what the legacy osctrl-admin templates do — every page surface is replicated. Both UIs can run side-by-side during a migration window (dev compose serves the SPA on :8088 while the legacy admin stays on :8443); the legacy admin is not touched by this PR.

⚠️ Stacked on #813 + #814. When those merge in order, this branch will be re-targeted at the new main HEAD with no conflicts.

End-to-end tested against a Kali docker deployment.

What's in frontend/

frontend/
├── package.json + package-lock.json
├── vite.config.ts                  ← Vite 7, /api → :8081 dev proxy
├── tsconfig.json                   ← TS 5 strict
├── public/favicon.svg              ← osctrl Geometric tower mark
├── scripts/copy-monaco.mjs         ← Self-host Monaco runtime under CSP
├── monaco-runtime.sha256           ← Supply-chain pin for Monaco
└── src/
    ├── main.tsx                    ← Router + Query bootstrap
    ├── routes/                     ← TanStack Router (file-based)
    ├── components/                 ← primitives / atoms / chrome / data / forms
    ├── features/                   ← One folder per page surface
    ├── api/                        ← Typed clients per resource
    ├── lib/                        ← cn, time, design-tokens
    └── styles/                     ← tokens.css + Tailwind base

Tech stack

  • React 19 + TypeScript 5 (strict)
  • Vite 7 + @tailwindcss/vite + Tailwind CSS v4
  • TanStack Router (typed file-based routing), TanStack Query 5, TanStack Table 8
  • react-hook-form 7 + zod 3
  • Radix UI primitives (à la carte), lucide-react (icons)
  • Monaco editor (lazy-loaded for osquery / config editors)
  • Vitest + @testing-library/react + jsdom

Bundle: ~780 KB JS / ~52 KB CSS pre-compression → ~214 KB JS + ~9 KB CSS after gzip. Monaco is code-split into its own chunk so pages that don't use it don't pay the editor cost.

Pages

Every page surface the legacy admin offers is covered:

  • Login (env picker via pre-auth /login/environments)
  • Dashboard (cross-env KPIs, agent versions, active queries, recently seen, failed enrolls)
  • Nodes table (paginated, sortable, searchable, quick-filters) — 4×24h activity heatmap per row
  • Node detail (system info, status logs, result logs, distributed queries, carves, activity tab)
  • Queries (list, run form with target selector + Monaco editor, results with virtual-scroll + CSV export, saved queries CRUD)
  • Carves (list, run form, detail with archive download)
  • Tags (env-scoped + global)
  • Users (list, permissions modal, token modal)
  • Profile (password change, token refresh)
  • Environments (list, create, edit, Monaco-based config editor with DiffView)
  • Enroll page (per-OS one-liners + downloads)
  • Audit log (paginated, filtered)
  • Settings (per-service, typed inputs)

Design system

Locked tokens, captured in frontend/src/styles/tokens.css and frontend/src/lib/design-tokens.ts (kept in sync):

  • Dark default, full light parity — data-theme="dark|light" on <html>
  • Signal-teal accent (#2bc4be dark / #0a8a85 light), one accent active per screen
  • Semantic status colors that always carry an icon or label (a11y)
  • Inter (body) + Space Grotesk (display / KPIs) + IBM Plex Mono (UUIDs / timestamps / cells)
  • Tabular nums throughout, no row-jitter on refresh
  • Density modes (comfortable / compact / dense) via CSS custom properties

Auth flow

  • HttpOnly osctrl_token cookie set by the API on login (no token in localStorage)
  • Double-submit CSRF (osctrl_csrf cookie + X-CSRF-Token header) for mutating requests
  • 401 on any endpoint → redirect to /login/$env?next=...
  • CLI / Bearer clients unaffected (no cookie present → no CSRF needed)

Deployment

Three patterns, which cross-reference each other for consistency:

  1. nginx (recommended): deploy/nginx/frontend.conf.example shows the production pattern, with root + try_files for the SPA, /api/* proxied to osctrl-api, baseline security headers (HSTS / CSP / X-Content-Type-Options / X-Frame-Options / Referrer-Policy / Permissions-Policy), immutable cache for hashed assets, and no-cache for index.html.
  2. Docker: deploy/docker/dockerfiles/Dockerfile-osctrl-frontend is multi-stage (node:20 builds dist/, nginx:alpine serves it and reverse-proxies /api/*). Single image, single binary's worth of operational surface.
  3. Static hosting + CDN: upload frontend/dist/ to S3/CloudFront/etc. and configure CORS on osctrl-api.

The dev compose stack adds an osctrl-frontend service that builds the same multi-stage image on :8088 alongside the legacy admin on :8443 so operators can compare the two on the same data.

Make targets

Target                   Effect
make frontend-install    npm ci
make frontend-dev        Vite dev server on :5173, proxies /api → :8081
make frontend-test       vitest + tsc
make frontend-build      Produces frontend/dist/
make frontend            install + build (CI / Docker shorthand)

CI

.github/workflows/frontend-build.yml:

  • Pinned action SHAs (matches osctrl convention)
  • Typecheck (npm run check → tsc --noEmit)
  • Tests (npm test → vitest)
  • Build (npm run build → vite)
  • dangerouslySetInnerHTML gate: build fails if it appears anywhere under src/. Every node-originating field must be JSX-escaped — this gate prevents a future contributor from silently regressing the XSS surface.
  • Uploads frontend/dist/ as a 7-day artifact

Test plan

  • npx tsc --noEmit — clean
  • npx vitest run — 19 test files, 92 tests pass
  • npm run build — produces frontend/dist/ cleanly
  • Backend untouched: go build ./..., go vet ./..., all 14 Go packages' tests pass
  • End-to-end smoke against a Kali docker deployment (login → nodes table → run a query → see results → carve a file → log out)

Why a separate frontend/ directory

  • The SPA's build / test / lint loop is fundamentally npm-based; it doesn't want to sit inside cmd/* next to the Go binaries.
  • A separate top-level folder makes it obvious to drive-by contributors what the directory is and what tooling expectations apply.
  • Per-folder CI: the workflow defaults to working-directory: frontend and can evolve independently of the Go workflows.

@javuto added the ✨ enhancement and ⭐️ frontend labels on May 13, 2026
alvarofraguas pushed a commit to alvarofraguas/osctrl that referenced this pull request May 13, 2026
Consolidated follow-up that lands on top of the three stacked PRs to
address lint, a real bug uncovered by lint, a stronger JWT-secret
contract, and a few deployment-correctness items.

== Lint cleanup (golangci-lint on PR jmpsec#815) ==

- pkg/auditlog/audit.go, pkg/dbutil/buckets.go: drop the redundant
  `.Dialector` selector on `*gorm.DB` (QF1008). `Dialector` is an
  embedded interface so the promoted `Name()` works directly.
- cmd/api/handlers/utils.go: remove the `postgresQueryLogs` function,
  flagged by the `unused` linter. Pre-existing dead code that surfaced once the
  package was touched by other PRs in the stack.
- cmd/admin/handlers/json-nodes.go: annotate the two legacy admin
  callers of `Nodes.SearchByEnvPage` / `Nodes.GetByEnvPage` with
  `//nolint:staticcheck // SA1019: intentional legacy admin caller;
  new SPA uses GetByEnvPaged`. The deprecation tag is correct — the
  legacy admin will migrate to `GetByEnvPaged` when it adopts the
  SPA's pagination shape; until then these calls are gated by the
  package-layer `SortableColumns` allowlist and are safe.
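
The QF1008 item relies on Go's method promotion through embedded fields. A minimal sketch with stand-in types (these are not the gorm definitions, only the same shape):

```go
package main

import "fmt"

// Dialector and DB are stand-ins that mirror the shape described above;
// they are not the gorm types.
type Dialector interface {
	Name() string
}

type postgres struct{}

func (postgres) Name() string { return "postgres" }

// DB embeds the interface, so Name() is promoted onto *DB.
type DB struct {
	Dialector
}

func main() {
	db := &DB{Dialector: postgres{}}
	// Both spellings call the same method; the explicit selector is redundant.
	fmt.Println(db.Dialector.Name(), db.Name())
}
```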

== Real bug uncovered by ineffassign ==

cmd/api/handlers/environments.go: in the `"create"` action, the
tag-creation failure path set `msgReturn = fmt.Sprintf("error
generating tag %s ", err.Error())` and then `return`-ed without ever
writing to the response. Result: the handler returned with nothing
written, so net/http sent an implicit HTTP 200 with an empty body on a
real failure, masking the error from the client. Replaced with a proper
`apiErrorResponse(w, "error generating tag",
http.StatusInternalServerError, err)`.

== JWT secret contract: decouple user-manager construction from
   token-signing config ==

Round 1 added `MinJWTSecretBytes = 32` to `users.CreateUserManager`,
which `log.Fatal`s when the JWT secret is shorter than 32 bytes. This
was correct for the API and admin services (they sign tokens) but
caught the CLI (`cmd/cli`) by surprise — the CLI doesn't mint JWTs at
all, it just manages user/permission rows directly, and was passing
`appName` ("osctrl", 6 bytes) as a placeholder. Every CLI invocation
with `--db` would abort with "JWT Secret too short" once Round 1
lands.

Fix: split the constructor.

  // DB-only constructor — never validates JWT.
  func CreateUserManager(backend *gorm.DB) *UserManager

  // Attach JWT signing config; validates the secret here.
  func (u *UserManager) WithJWT(*config.YAMLConfigurationJWT) *UserManager

Token-issuing callers chain:

  apiUsers   = users.CreateUserManager(db.Conn).WithJWT(flagParams.JWT)
  adminUsers = users.CreateUserManager(db.Conn).WithJWT(flagParams.JWT)

The CLI calls only `CreateUserManager(db.Conn)`. `CreateToken` now
returns an error if invoked on a manager without `WithJWT` —
defense-in-depth so a future caller can't accidentally sign tokens
with a nil config.

== TLS handler: per-endpoint body caps ==

cmd/tls/handlers/post.go: wrap the request body in
`http.MaxBytesReader` before every osquery-facing `io.ReadAll`, with an
endpoint-appropriate ceiling. Without
this, a misbehaving or hostile node can submit an arbitrarily large
body and force the server to buffer it before parsing. Caps are
chosen against real osquery payload sizes:

  enroll:           64 KiB
  config:           64 KiB
  log:             100 MiB   (osquery log batches)
  query read:       16 KiB
  query write:     100 MiB
  carve init:        8 KiB
  carve block:      16 MiB
  quick-enroll:      8 KiB
  flags / cert:      8 KiB
  verify / script:   8 KiB
  osquery config:    2 MiB

A single `readBody(w, r, max)` helper applies the cap and reads, so
the call sites stay one-line.

== Carve compression: out-of-bounds panic guard ==

pkg/carves/utils.go: `CheckCompressionRaw` previously sliced
`data[:4]` to compare against the zstd header without checking the
length first. A truncated or empty block (which an authenticated
CarveLevel client can submit) would panic with an out-of-range error.
Guarded with a length check.
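
A sketch of the guarded check. The function name `hasZstdHeader` is hypothetical and the real `CheckCompressionRaw` may differ in signature; the four magic bytes are the standard Zstandard frame header (0xFD2FB528 stored little-endian).

```go
package main

import (
	"bytes"
	"fmt"
)

// zstdMagic is the four-byte Zstandard frame magic number
// (0xFD2FB528, stored little-endian).
var zstdMagic = []byte{0x28, 0xB5, 0x2F, 0xFD}

// hasZstdHeader reports whether data starts with the zstd magic bytes.
// The length check runs first, so a short or empty block returns false
// instead of panicking on data[:4].
func hasZstdHeader(data []byte) bool {
	if len(data) < len(zstdMagic) {
		return false
	}
	return bytes.Equal(data[:len(zstdMagic)], zstdMagic)
}

func main() {
	fmt.Println(hasZstdHeader(nil))                                  // empty block
	fmt.Println(hasZstdHeader([]byte{0x28, 0xB5}))                   // truncated block
	fmt.Println(hasZstdHeader([]byte{0x28, 0xB5, 0x2F, 0xFD, 0x00})) // zstd frame
}
```

`bytes.HasPrefix(data, zstdMagic)` gives the same result and carries the length check implicitly; the explicit form is shown because it matches the fix as described.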

== Frontend (PR jmpsec#815 scope) ==

- frontend/index.html: removed the inline `<script>` that bootstraps
  the theme attribute and moved it to
  `frontend/public/theme-bootstrap.js`. The inline form violated the
  CSP `script-src 'self' blob:` deployed on the SPA (45 CSP errors per
  page load in the prior audit).
- frontend/package.json (+ package-lock.json): add an `overrides`
  block pinning `dompurify@^3.4.3` to remediate the transitive
  vulnerability advisory.
- frontend/src/features/dashboard/DashboardPage.tsx: wire the
  time-series chart to real `getEnvActivity` data instead of the
  placeholder constants. Added a 7d / 24h interactive toggle and an
  `aggregateBuckets` helper that collapses the 15-min server buckets
  into N display bins.
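
The bucket collapse can be illustrated as follows. The real `aggregateBuckets` helper lives in the TypeScript frontend and its exact behavior is not shown here; this Go sketch assumes plain integer counts that are summed into evenly sized bins.

```go
package main

import "fmt"

// aggregateBuckets sums consecutive source buckets into at most n display
// bins. Summing (rather than averaging) is an assumption of this sketch.
func aggregateBuckets(buckets []int, n int) []int {
	if n <= 0 || len(buckets) == 0 {
		return nil
	}
	if n > len(buckets) {
		n = len(buckets)
	}
	bins := make([]int, n)
	for i, v := range buckets {
		// Map source index i in [0, len) to a bin index in [0, n).
		bins[i*n/len(buckets)] += v
	}
	return bins
}

func main() {
	// 24 hours of 15-minute buckets is 96 entries; collapse them to 24 hourly bins.
	day := make([]int, 96)
	for i := range day {
		day[i] = 1
	}
	fmt.Println(aggregateBuckets(day, 24))
}
```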

== Dev stack ==

deploy/docker/conf/osquery/entrypoint.sh: pin
`--host_identifier=specified --specified_identifier=$(hostname)` so
the three dev osquery containers enroll as distinct nodes instead of
colliding on the host kernel UUID. The full Kali docker stack now
enrolls four unique nodes (the Kali host + three containers).

== Toolchain ==

Bump Go from 1.26.1 to 1.26.3 across go.mod, the four
deploy/docker/dockerfiles/Dockerfile-dev-*, deploy/lib.sh,
.env.example, the .github/actions/{build,test}/binaries action
manifests, and the five .github/workflows/*.yml. 1.26.3 carries the
stdlib CVE fixes flagged by the local govulncheck run.

Verified locally:
- go build ./...  (clean on Go 1.26.3)
- go test  ./...  (all packages green, including new CreateUserManager
  / WithJWT tests)
- frontend: vitest 92/92 green; npm run build clean; npm audit
  --omit=dev → 0 vulnerabilities
- Live smoke on the Kali dev stack: container rebuild + recreate;
  osctrl-cli runs `env show`, `node-actions secret`, `show-flags`
  without the old JWT-too-short fatal; osctrl-api signs a real login
  token through the SPA; the four enrolled osquery nodes remain in
  the DB.
alvarofraguas pushed a commit to alvarofraguas/osctrl that referenced this pull request May 14, 2026
@alvarofraguas alvarofraguas force-pushed the pr/round-3-frontend branch from 4235626 to 6c0584b Compare May 14, 2026 08:45
alvarofraguas pushed a commit to alvarofraguas/osctrl that referenced this pull request May 14, 2026
@alvarofraguas alvarofraguas force-pushed the pr/round-3-frontend branch from 6c0584b to 36f4e89 Compare May 14, 2026 17:04
alvarofraguas added a commit to alvarofraguas/osctrl that referenced this pull request May 14, 2026
@alvarofraguas alvarofraguas force-pushed the pr/round-3-frontend branch from 36f4e89 to 8189434 Compare May 14, 2026 17:08
alvarofraguas added a commit to alvarofraguas/osctrl that referenced this pull request May 14, 2026
Consolidated follow-up that lands on top of the three stacked PRs to
address lint, a real bug uncovered by lint, a stronger JWT-secret
contract, and a few deployment-correctness items.

== Lint cleanup (golangci-lint on PR jmpsec#815) ==

- pkg/auditlog/audit.go, pkg/dbutil/buckets.go: drop the redundant
  `.Dialector` selector on `*gorm.DB` (QF1008). `Dialector` is an
  embedded interface so the promoted `Name()` works directly.
- cmd/api/handlers/utils.go: remove the unused `postgresQueryLogs`
  function (unused). Pre-existing dead code that surfaced once the
  package was touched by other PRs in the stack.
- cmd/admin/handlers/json-nodes.go: annotate the two legacy admin
  callers of `Nodes.SearchByEnvPage` / `Nodes.GetByEnvPage` with
  `//nolint:staticcheck // SA1019: intentional legacy admin caller;
  new SPA uses GetByEnvPaged`. The deprecation tag is correct — the
  legacy admin will migrate to `GetByEnvPaged` when it adopts the
  SPA's pagination shape; until then these calls are gated by the
  package-layer `SortableColumns` allowlist and are safe.

== Real bug uncovered by ineffassign ==

cmd/api/handlers/environments.go: in the `"create"` action, the
tag-creation failure path set `msgReturn = fmt.Sprintf("error
generating tag %s ", err.Error())` and then `return`-ed without ever
writing to the response. Result: on a real failure the client received
Go's implicit HTTP 200 with an empty body, which masked the error.
Replaced with a proper `apiErrorResponse(w, "error generating tag",
http.StatusInternalServerError, err)`.

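The failure mode above can be reproduced in a few lines. This is only a sketch: `createTag`, `buggyCreate`, `fixedCreate`, and `statusOf` are hypothetical names, and the `apiErrorResponse` stand-in assumes a signature inferred from the call quoted above rather than the real helper.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// createTag is a hypothetical stand-in that always fails.
func createTag() error { return errors.New("tag backend unavailable") }

// apiErrorResponse is a stand-in; the real helper's signature is assumed.
func apiErrorResponse(w http.ResponseWriter, msg string, code int, err error) {
	http.Error(w, fmt.Sprintf("%s: %v", msg, err), code)
}

// buggyCreate mirrors the old path: the message is assigned, nothing is written.
func buggyCreate(w http.ResponseWriter, r *http.Request) {
	var msgReturn string
	if err := createTag(); err != nil {
		msgReturn = fmt.Sprintf("error generating tag %s ", err.Error())
		_ = msgReturn // keeps this sketch compiling; the value is never sent
		return
	}
	w.WriteHeader(http.StatusOK)
}

// fixedCreate writes an explicit error response before returning.
func fixedCreate(w http.ResponseWriter, r *http.Request) {
	if err := createTag(); err != nil {
		apiErrorResponse(w, "error generating tag", http.StatusInternalServerError, err)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusOf runs a handler against an in-memory recorder and returns the status.
func statusOf(h http.HandlerFunc) int {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodPost, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println("buggy:", statusOf(buggyCreate), "fixed:", statusOf(fixedCreate))
}
```

A handler that returns without writing leaves the status at its default of 200, which is why the client could not tell the failure apart from success.
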
== JWT secret contract: decouple user-manager construction from
   token-signing config ==

Round 1 added `MinJWTSecretBytes = 32` to `users.CreateUserManager`,
which `log.Fatal`s when the JWT secret is shorter than 32 bytes. This
was correct for the API and admin services (they sign tokens) but
caught the CLI (`cmd/cli`) by surprise — the CLI doesn't mint JWTs at
all, it just manages user/permission rows directly, and was passing
`appName` ("osctrl", 6 bytes) as a placeholder. Every CLI invocation
with `--db` would abort with "JWT Secret too short" once Round 1
lands.

Fix: split the constructor.

  // DB-only constructor — never validates JWT.
  func CreateUserManager(backend *gorm.DB) *UserManager

  // Attach JWT signing config; validates the secret here.
  func (u *UserManager) WithJWT(*config.YAMLConfigurationJWT) *UserManager

Token-issuing callers chain:

  apiUsers   = users.CreateUserManager(db.Conn).WithJWT(flagParams.JWT)
  adminUsers = users.CreateUserManager(db.Conn).WithJWT(flagParams.JWT)

The CLI calls only `CreateUserManager(db.Conn)`. `CreateToken` now
returns an error if invoked on a manager without `WithJWT` —
defense-in-depth so a future caller can't accidentally sign tokens
with a nil config.
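
A self-contained sketch of that split, under stated assumptions: `JWTConfig` and its `JWTSecret` field stand in for `config.YAMLConfigurationJWT`, the `*gorm.DB` backend is omitted, and the HMAC tag is only a placeholder for real JWT signing.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"log"
)

const MinJWTSecretBytes = 32

// JWTConfig stands in for config.YAMLConfigurationJWT (field name assumed).
type JWTConfig struct{ JWTSecret string }

// UserManager omits the *gorm.DB backend to keep the sketch self-contained.
type UserManager struct{ jwt *JWTConfig }

// CreateUserManager is the DB-only constructor: it never validates JWT.
func CreateUserManager() *UserManager { return &UserManager{} }

// WithJWT attaches signing config and validates the secret length here.
func (u *UserManager) WithJWT(cfg *JWTConfig) *UserManager {
	if cfg == nil || len(cfg.JWTSecret) < MinJWTSecretBytes {
		log.Fatalf("JWT secret must be at least %d bytes", MinJWTSecretBytes)
	}
	u.jwt = cfg
	return u
}

// CreateToken refuses to sign when WithJWT was never called.
// The hex HMAC tag is a placeholder, not a real JWT.
func (u *UserManager) CreateToken(username string) (string, error) {
	if u.jwt == nil {
		return "", errors.New("user manager has no JWT config; call WithJWT first")
	}
	mac := hmac.New(sha256.New, []byte(u.jwt.JWTSecret))
	mac.Write([]byte(username))
	return hex.EncodeToString(mac.Sum(nil)), nil
}

func main() {
	cli := CreateUserManager() // CLI path: no signing config
	_, err := cli.CreateToken("admin")
	fmt.Println("cli:", err)

	api := CreateUserManager().WithJWT(&JWTConfig{JWTSecret: "0123456789abcdef0123456789abcdef"})
	tok, _ := api.CreateToken("admin")
	fmt.Println("api token length:", len(tok))
}
```

Moving validation into `WithJWT` keeps the fail-fast behavior for services that sign tokens while letting DB-only callers construct a manager without a secret.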

== TLS handler: per-endpoint body caps ==

cmd/tls/handlers/post.go: wrap every osquery-facing `io.ReadAll` in
`http.MaxBytesReader` with an endpoint-appropriate ceiling. Without
this, a misbehaving or hostile node can submit an arbitrarily large
body and force the server to buffer it before parsing. Caps are
chosen against real osquery payload sizes:

  enroll:           64 KiB
  config:           64 KiB
  log:             100 MiB   (osquery log batches)
  query read:       16 KiB
  query write:     100 MiB
  carve init:        8 KiB
  carve block:      16 MiB
  quick-enroll:      8 KiB
  flags / cert:      8 KiB
  verify / script:   8 KiB
  osquery config:    2 MiB

A single `readBody(w, r, max)` helper applies the cap and reads, so
the call sites stay one-line.
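
A helper of this shape might look like the sketch below. The return type of `readBody` is an assumption (only the argument list appears above), and `tryRead` is a hypothetical driver added for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// readBody caps the request body at max bytes, then reads it in full.
func readBody(w http.ResponseWriter, r *http.Request, max int64) ([]byte, error) {
	r.Body = http.MaxBytesReader(w, r.Body, max)
	return io.ReadAll(r.Body)
}

// tryRead posts an n-byte body and reports how much was read and whether
// the cap rejected it.
func tryRead(n int, max int64) (read int, tooLarge bool) {
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(strings.Repeat("a", n)))
	body, err := readBody(httptest.NewRecorder(), req, max)
	var mbe *http.MaxBytesError
	return len(body), errors.As(err, &mbe)
}

func main() {
	const enrollCap = 64 << 10 // 64 KiB, the enroll ceiling listed above
	_, small := tryRead(1024, enrollCap)
	_, big := tryRead(128<<10, enrollCap)
	fmt.Println("1 KiB rejected:", small, "| 128 KiB rejected:", big)
}
```

Callers can distinguish an oversized body from other read errors with `errors.As` against `*http.MaxBytesError` and answer with 413 where that is appropriate.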

== Carve compression: out-of-bounds panic guard ==

pkg/carves/utils.go: `CheckCompressionRaw` previously sliced
`data[:4]` without checking the length before comparing against the
zstd header. A truncated or empty block (which an authenticated
CarveLevel client can submit) would panic with a slice-bounds
out-of-range error. Guarded with a length check.
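
A minimal sketch of such a guard, assuming the check only needs to recognize the 4-byte zstd frame magic; `checkCompressionRaw` is a hypothetical lowercase stand-in and the real function's signature may differ.

```go
package main

import (
	"bytes"
	"fmt"
)

// zstdMagic is the zstd frame magic number 0xFD2FB528 in little-endian byte order.
var zstdMagic = []byte{0x28, 0xB5, 0x2F, 0xFD}

// checkCompressionRaw reports whether data starts with the zstd magic.
// The length check runs first, so short or empty blocks cannot panic.
func checkCompressionRaw(data []byte) bool {
	if len(data) < len(zstdMagic) {
		return false
	}
	return bytes.Equal(data[:len(zstdMagic)], zstdMagic)
}

func main() {
	fmt.Println(checkCompressionRaw(nil))
	fmt.Println(checkCompressionRaw([]byte{0x28, 0xB5}))
	fmt.Println(checkCompressionRaw([]byte{0x28, 0xB5, 0x2F, 0xFD, 0x00}))
}
```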

== Frontend (PR jmpsec#815 scope) ==

- frontend/index.html: removed the inline `<script>` that bootstraps
  the theme attribute and moved it to
  `frontend/public/theme-bootstrap.js`. The inline form violated the
  CSP `script-src 'self' blob:` deployed on the SPA (45 CSP errors per
  page load in the prior audit).
- frontend/package.json (+ package-lock.json): add an `overrides`
  block pinning `dompurify@^3.4.3` to remediate the transitive
  vulnerability advisory.
- frontend/src/features/dashboard/DashboardPage.tsx: wire the
  time-series chart to real `getEnvActivity` data instead of the
  placeholder constants. Added a 7d / 24h interactive toggle and an
  `aggregateBuckets` helper that collapses the 15-min server buckets
  into N display bins.

== Dev stack ==

deploy/docker/conf/osquery/entrypoint.sh: pin
`--host_identifier=specified --specified_identifier=$(hostname)` so
the three dev osquery containers enroll as distinct nodes instead of
colliding on the host kernel UUID. The full Kali docker stack now
enrolls four unique nodes (the Kali host + three containers).

== Toolchain ==

Bump Go from 1.26.1 to 1.26.3 across go.mod, the four
deploy/docker/dockerfiles/Dockerfile-dev-*, deploy/lib.sh,
.env.example, the .github/actions/{build,test}/binaries action
manifests, and the five .github/workflows/*.yml. 1.26.3 carries the
stdlib CVE fixes flagged by the local govulncheck run.

Verified locally:
- go build ./...  (clean on Go 1.26.3)
- go test  ./...  (all packages green, including new CreateUserManager
  / WithJWT tests)
- frontend: vitest 92/92 green; npm run build clean; npm audit
  --omit=dev → 0 vulnerabilities
- Live smoke on the Kali dev stack: container rebuild + recreate;
  osctrl-cli runs `env show`, `node-actions secret`, `show-flags`
  without the old JWT-too-short fatal; osctrl-api signs a real login
  token through the SPA; the four enrolled osquery nodes remain in
  the DB.
@alvarofraguas alvarofraguas force-pushed the pr/round-3-frontend branch from 8189434 to 2989a56 Compare May 14, 2026 17:19
…, shared rate-limit + audit-log infra

Server-side hardening for osctrl-api, plus shared infrastructure
(rate-limit package, audit-log helpers, trusted-proxies plumbing)
that osctrl-tls also consumes — its consumer-side changes ship in a
companion PR so the TLS-facing surface can be tested in isolation.

== Auth bedrock ==

cmd/api:
  - --auth=jwt is now the default. Refuse to start with --auth=none
    unless OSCTRL_INSECURE_NO_AUTH=1 is set. When opted in, a 60s
    warning ticker keeps the deployment from drifting into
    'auth-off forever'.
  - HttpOnly + Secure cookie session for SPA-style clients
    (osctrl_token). CLI clients with Authorization: Bearer continue
    to work unchanged.
  - Double-submit CSRF (osctrl_csrf cookie + X-CSRF-Token header) for
    mutating cookie-authenticated requests. CLI Bearer flows exempt.
  - JWT signing-algorithm pin (HMAC only) to defeat alg-confusion
    attacks (alg:none / RS256-with-HS256-verify).
  - JWT secret minimum 32 bytes (HS256 needs HMAC key ≥ hash output).
    Startup fails fast with the openssl one-liner if too short.
  - Strict 'forwarded headers' trust via --trusted-proxies. Empty
    default means utils.GetIP ignores X-Forwarded-For / X-Real-IP —
    an internet attacker can't spoof IPs to defeat rate-limits or
    poison audit logs.
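
The double-submit check in the list above can be sketched as follows. The cookie and header names come from this message; `csrfOK` and `check` are hypothetical, and the Bearer exemption is simplified to a header-prefix test, which may not match how the real middleware decides that a request is cookie-authenticated.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// csrfOK enforces double-submit CSRF on mutating, cookie-authenticated requests.
func csrfOK(r *http.Request) bool {
	switch r.Method {
	case http.MethodGet, http.MethodHead, http.MethodOptions:
		return true // non-mutating methods are not checked
	}
	if strings.HasPrefix(r.Header.Get("Authorization"), "Bearer ") {
		return true // CLI Bearer flows are exempt (simplified assumption)
	}
	c, err := r.Cookie("osctrl_csrf")
	if err != nil || c.Value == "" {
		return false
	}
	h := r.Header.Get("X-CSRF-Token")
	// Constant-time comparison avoids leaking how many bytes matched.
	return subtle.ConstantTimeCompare([]byte(c.Value), []byte(h)) == 1
}

// check builds a request with the given cookie and header and runs csrfOK.
func check(method, cookie, header string) bool {
	req := httptest.NewRequest(method, "/", nil)
	if cookie != "" {
		req.AddCookie(&http.Cookie{Name: "osctrl_csrf", Value: cookie})
	}
	if header != "" {
		req.Header.Set("X-CSRF-Token", header)
	}
	return csrfOK(req)
}

func main() {
	fmt.Println(check("POST", "abc", "abc"), check("POST", "abc", "xyz"), check("GET", "", ""))
}
```

A cross-site page can make the browser send the cookie but cannot read it to copy the value into the header, so a mismatch or a missing header is rejected.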

== Env secret containment + cross-env defense ==

pkg/types: new TLSEnvironmentView — the low-privilege env projection.
  Omits Secret, EnrollSecretPath, RemoveSecretPath, Certificate, Flags,
  and every other field that materially contributes to enrolling a node.

cmd/api/handlers/environments.go:
  - EnvironmentHandler now branches on access level: AdminLevel (or
    super-admin) gets the full storage struct; UserLevel gets the
    low-priv view.
  - EnvEnrollHandler / EnvRemoveHandler raised from UserLevel to
    AdminLevel — both embed the env's enroll/remove secret.
  - Both handlers log only the target name, not returnData.
  - EnvActionsHandler 'create' branch validates caller-supplied UUID
    via EnvUUIDFilter (rejects malformed) and EnvExists (rejects
    collision). 'delete' branch gets the same validation for symmetry.

cmd/api/handlers/queries.go: QueryResultsHandler now precheck-validates
  the named query belongs to env.ID via h.Queries.Exists(name, env.ID)
  and returns 404 otherwise. logging.GetQueryResults filtered on 'name'
  only, so without this gate a user with QueryLevel on env A could
  pull results from env B by passing B's query name in A's URL.

pkg/environments/environments.go: tighten EnvUUIDFilter regex and add
  axis-pure Exists/UUIDExists helpers so handler checks can match the
  router's expectations exactly.

== Shared rate-limit + audit-log infrastructure ==

pkg/ratelimit (new): per-key token-bucket rate limiter with idle
  eviction. Used by osctrl-api for /login here, and by osctrl-tls for
  /enroll in the companion PR. Tunable burst, window, and key
  function (KeyByIP today; KeyByIPAndEnv available).
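
  A per-key token bucket with idle eviction can be sketched as below. The type and method names (`Limiter`, `Allow`, `Evict`, `demo`) are hypothetical and not the pkg/ratelimit API; time is passed in explicitly so the behavior is deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type bucket struct {
	tokens   float64
	lastSeen time.Time
}

// Limiter holds one token bucket per key: burst tokens, refilled evenly over window.
type Limiter struct {
	mu      sync.Mutex
	burst   float64
	window  time.Duration
	idleTTL time.Duration
	buckets map[string]*bucket
}

func NewLimiter(burst int, window, idleTTL time.Duration) *Limiter {
	return &Limiter{burst: float64(burst), window: window, idleTTL: idleTTL, buckets: map[string]*bucket{}}
}

// Allow reports whether key may proceed at time now, consuming one token if so.
func (l *Limiter) Allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[key]
	if !ok {
		b = &bucket{tokens: l.burst, lastSeen: now}
		l.buckets[key] = b
	}
	// Refill in proportion to elapsed time, capped at the burst size.
	b.tokens += l.burst * float64(now.Sub(b.lastSeen)) / float64(l.window)
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.lastSeen = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// Evict removes buckets idle for longer than idleTTL and returns how many were dropped.
func (l *Limiter) Evict(now time.Time) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	n := 0
	for k, b := range l.buckets {
		if now.Sub(b.lastSeen) > l.idleTTL {
			delete(l.buckets, k)
			n++
		}
	}
	return n
}

// demo sends 5 requests at once, 1 after a full window, then evicts after idling.
func demo() (allowed int, refilled bool, evicted int) {
	l := NewLimiter(3, time.Minute, 10*time.Minute)
	t0 := time.Unix(0, 0)
	for i := 0; i < 5; i++ {
		if l.Allow("10.0.0.1", t0) {
			allowed++
		}
	}
	refilled = l.Allow("10.0.0.1", t0.Add(time.Minute))
	evicted = l.Evict(t0.Add(12 * time.Minute))
	return
}

func main() {
	fmt.Println(demo())
}
```

  Eviction keeps the map from growing without bound when many distinct keys hit the endpoint once and never return.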

pkg/auditlog/audit.go: FailedLogin + FailedEnroll helpers — a clean
  stream of authn/enroll failures for SOC tooling to alert on
  brute-force, password-spray, and enroll abuse.

pkg/utils/http-utils.go: SetTrustedProxies + an updated GetIP that
  honors the trusted-proxies set. Empty (default) ignores
  X-Forwarded-For / X-Real-IP entirely.
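
  One way such a trust check could work is sketched below. `setTrustedProxies`, `getIP`, and `clientIP` are hypothetical lowercase stand-ins, and the right-to-left walk over X-Forwarded-For is an assumption about policy, not a description of the actual pkg/utils code.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
)

var trusted []*net.IPNet

// setTrustedProxies replaces the trusted set with the parsed CIDRs.
func setTrustedProxies(cidrs []string) error {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			return err
		}
		nets = append(nets, n)
	}
	trusted = nets
	return nil
}

func isTrusted(ip net.IP) bool {
	for _, n := range trusted {
		if n.Contains(ip) {
			return true
		}
	}
	return false
}

// getIP returns the TCP peer unless that peer is a trusted proxy; only then
// are the forwarded headers consulted.
func getIP(r *http.Request) string {
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		host = r.RemoteAddr
	}
	peer := net.ParseIP(host)
	if peer == nil || !isTrusted(peer) {
		return host
	}
	// Walk right to left: the first untrusted hop is the real client.
	parts := strings.Split(r.Header.Get("X-Forwarded-For"), ",")
	for i := len(parts) - 1; i >= 0; i-- {
		if ip := net.ParseIP(strings.TrimSpace(parts[i])); ip != nil && !isTrusted(ip) {
			return ip.String()
		}
	}
	if ip := net.ParseIP(strings.TrimSpace(r.Header.Get("X-Real-IP"))); ip != nil {
		return ip.String()
	}
	return host
}

// clientIP builds a request with the given peer address and header.
func clientIP(remoteAddr, xff string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.RemoteAddr = remoteAddr
	if xff != "" {
		req.Header.Set("X-Forwarded-For", xff)
	}
	return getIP(req)
}

func main() {
	fmt.Println(clientIP("203.0.113.9:4444", "1.2.3.4")) // header ignored
	_ = setTrustedProxies([]string{"10.0.0.0/8"})
	fmt.Println(clientIP("10.0.0.5:4444", "1.2.3.4, 10.0.0.7")) // header honored
}
```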

== SQL hardening + carve path safety ==

pkg/carves/utils.go: new ValidCarvePath regexp gate. Without this gate
  a CarveLevel operator could pass `'; SELECT 1; --` and pivot 'carve
  a file' into 'run any SELECT against your fleet' via GenCarveQuery's
  string concat.

cmd/api/handlers/carves.go (CarvesRunHandler): path validated before
  the SQL splice. Rejected paths return 400.
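
  A sketch of an allowlist gate in front of the string concatenation. The regular expression and the query text are illustrative assumptions; the actual ValidCarvePath pattern and GenCarveQuery output are not shown in this message.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// validCarvePath is a hypothetical allowlist: letters, digits, and common
// path punctuation. Quotes and semicolons are not in the set, so they fail.
var validCarvePath = regexp.MustCompile(`^[A-Za-z0-9_./\\: %-]+$`)

func validPath(p string) bool { return validCarvePath.MatchString(p) }

// genCarveQuery builds an illustrative carve query only for validated paths.
func genCarveQuery(path string) (string, error) {
	if !validPath(path) {
		return "", errors.New("invalid carve path")
	}
	return "SELECT * FROM carves WHERE carve=1 AND path = '" + path + "';", nil
}

func main() {
	fmt.Println(genCarveQuery("/etc/passwd"))
	fmt.Println(genCarveQuery("'; SELECT 1; --"))
}
```

  An allowlist is used here rather than escaping because the path ends up inside SQL text that is shipped to nodes, where parameter binding is not available.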

== Authz + audit-log hardening ==

pkg/users:
  - bcrypt cost raised from default (10) to 12. CheckLoginCredentials
    opportunistically re-hashes existing users at next login (no
    password reset needed). Rehash failure is non-fatal.
  - New ClearToken empties APIToken AND CSRFToken so any existing JWT
    + CSRF cookie pair stops validating. Used by future
    DELETE /api/v1/users/{username}/token in a follow-up PR.

cmd/api/handlers/{users,settings,environments}.go: authz tightenings
  around permission writes, settings PATCH, and env-action service-name
  validation.

pkg/environments/env-cache.go: keep the 2h cleanup interval; introduce
  an envCacheTTL constant so the value is self-documenting and tunable
  locally without changing runtime defaults.

== Defaults + ops ==

deploy/config/{api,admin}.yml: flip --audit-log default to true so
  audit log writes are on by default. Operators can disable with
  --audit-log=false.

Verified: go build ./... clean, go vet ./... clean, go test ./pkg/...
./cmd/api/... ./cmd/tls/... all green.
Round 2 of 3 (round 1: security; round 3: frontend). Adds the API
surface the SPA needs to fully replace the legacy admin templates.
No existing routes are removed or repurposed — every new endpoint is
additive. The new shapes are SPA-canonical (paginated envelope,
projections, typed PATCH bodies).

== New endpoints ==

Stats / dashboard:
  GET /api/v1/stats                                  cross-env summary KPIs
  GET /api/v1/stats/osquery-versions                 fleet agent versions
  GET /api/v1/stats/activity/{env}                   env-scoped audit-log activity heatmap
  GET /api/v1/stats/activity/node/{env}/{uuid}       per-node activity heatmap
  GET /api/v1/stats/activity/node-batch/{env}        per-node heatmap, up to 100 uuids

Logs (live SPA log viewer):
  GET /api/v1/logs/{type}/{env}/{uuid}               paginated, since-aware

Saved queries (full CRUD):
  GET    /api/v1/saved-queries/{env}
  POST   /api/v1/saved-queries/{env}
  PATCH  /api/v1/saved-queries/{env}/{name}
  DELETE /api/v1/saved-queries/{env}/{name}

User profile + token + permissions:
  GET    /api/v1/users/me
  PATCH  /api/v1/users/me
  POST   /api/v1/users/me/password
  POST   /api/v1/users/{username}/permissions
  POST   /api/v1/users/{username}/token/refresh
  DELETE /api/v1/users/{username}/token

Environment CRUD + config PATCHes:
  POST   /api/v1/environments
  PATCH  /api/v1/environments/{env}
  DELETE /api/v1/environments/{env}
  GET    /api/v1/environments/{env}/config
  PATCH  /api/v1/environments/{env}/config
  PATCH  /api/v1/environments/{env}/intervals
  PATCH  /api/v1/environments/{env}/expiration

Settings PATCH:
  PATCH  /api/v1/settings/{service}/{name}

Audit log filters + pagination:
  GET    /api/v1/audit-logs?service=&username=&type=&envUuid=&since=&until=&page=&pageSize=

Login envs (pre-auth env list):
  GET    /api/v1/login/environments                  pre-auth-safe UUID+name only

Sample libraries (operator starter packs):
  GET    /api/v1/queries/samples
  GET    /api/v1/carves/samples
  GET    /api/v1/osquery/tables

== Pagination + sort + search ==

Every list endpoint accepts ?page=&page_size= (default 50, max 500) and
returns the envelope:
  { "items": [...], "page": N, "page_size": N, "total_items": N, "total_pages": N }
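
The parameter clamping and the envelope can be sketched as below. `Page`, `parsePaging`, and `paginate` are hypothetical names; the sketch slices an in-memory list, whereas a real handler would presumably page in SQL.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

const (
	defaultPageSize = 50
	maxPageSize     = 500
)

// Page mirrors the envelope field names listed above.
type Page[T any] struct {
	Items      []T `json:"items"`
	Page       int `json:"page"`
	PageSize   int `json:"page_size"`
	TotalItems int `json:"total_items"`
	TotalPages int `json:"total_pages"`
}

// parsePaging reads ?page=&page_size= and falls back to defaults on bad input.
func parsePaging(rawQuery string) (page, size int) {
	q, _ := url.ParseQuery(rawQuery)
	page, size = 1, defaultPageSize
	if n, err := strconv.Atoi(q.Get("page")); err == nil && n >= 1 {
		page = n
	}
	if n, err := strconv.Atoi(q.Get("page_size")); err == nil && n >= 1 {
		size = n
	}
	if size > maxPageSize {
		size = maxPageSize
	}
	return page, size
}

// paginate slices all into the requested page and fills in the totals.
func paginate[T any](all []T, page, size int) Page[T] {
	total := len(all)
	start := (page - 1) * size
	if start > total {
		start = total
	}
	end := start + size
	if end > total {
		end = total
	}
	return Page[T]{
		Items:      all[start:end],
		Page:       page,
		PageSize:   size,
		TotalItems: total,
		TotalPages: (total + size - 1) / size,
	}
}

func main() {
	page, size := parsePaging("page=3&page_size=50")
	p := paginate(make([]int, 120), page, size)
	fmt.Println(len(p.Items), p.TotalItems, p.TotalPages)
}
```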

Sortable fields use a per-resource SortableColumns allowlist enforced
at the package layer (pkg/nodes, pkg/queries, pkg/carves). Unknown sort
keys fall back to the resource's default order without 400ing.
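
An allowlist lookup of this kind might look like the following sketch. The column names and the default order are invented for illustration and are not the real pkg/nodes values.

```go
package main

import (
	"fmt"
	"strings"
)

// sortableColumns maps accepted sort keys to column names (hypothetical values).
var sortableColumns = map[string]string{
	"hostname":  "hostname",
	"platform":  "platform",
	"last_seen": "last_seen",
}

const defaultOrder = "last_seen DESC"

// orderClause returns an ORDER BY fragment built only from allowlisted
// columns; unknown keys fall back to the default instead of erroring.
func orderClause(sortKey, dir string) string {
	col, ok := sortableColumns[sortKey]
	if !ok {
		return defaultOrder
	}
	if strings.EqualFold(dir, "desc") {
		return col + " DESC"
	}
	return col + " ASC"
}

func main() {
	fmt.Println(orderClause("hostname", "asc"))
	fmt.Println(orderClause("hostname; DROP TABLE nodes", "asc"))
}
```

Because the fragment is assembled from map values and two fixed direction strings, no caller-supplied text reaches the SQL.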

Search is ?q= free-text against a per-resource field set (case-insensitive
LIKE). Wildcards are escaped server-side.
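
Wildcard escaping for a LIKE search can be sketched as follows, assuming backslash is used as the escape character; `escapeLike` and `likePattern` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// likeEscaper escapes the escape character itself plus both LIKE wildcards.
var likeEscaper = strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)

// escapeLike makes user text match literally inside a LIKE pattern.
func escapeLike(s string) string { return likeEscaper.Replace(s) }

// likePattern lowercases the term and wraps it for a contains-style match.
func likePattern(q string) string {
	return "%" + escapeLike(strings.ToLower(q)) + "%"
}

func main() {
	fmt.Println(escapeLike("50%_off"))
	fmt.Println(likePattern("Web_01"))
}
```

The pattern would then be bound as a parameter, for example against `LOWER(column) LIKE ? ESCAPE '\'`, so a search for `web_01` does not also match `webX01`.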

== New package: pkg/dbutil ==

Dialect-aware SQL bucket-expression helper (postgres / mysql / sqlite)
used by the activity heatmap endpoints. Each category (status logs /
result logs / distributed queries / carves) issues a single SQL
GROUP BY rather than plucking every timestamp — at 50k+ nodes the
table-page heatmap query is bounded by the index instead of the
chatty-row count.
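
A dialect switch for the bucket expression might look like the sketch below. The function name and the exact SQL are assumptions rather than the pkg/dbutil implementation; the dialect strings match what GORM's `Dialector.Name()` reports for these drivers.

```go
package main

import (
	"fmt"
)

// bucketExpr returns a SQL expression that floors a timestamp column to the
// start of its bucket, expressed as Unix seconds, for the given dialect.
func bucketExpr(dialect, column string, seconds int) (string, error) {
	switch dialect {
	case "postgres":
		return fmt.Sprintf("floor(extract(epoch from %s) / %d) * %d", column, seconds, seconds), nil
	case "mysql":
		return fmt.Sprintf("floor(unix_timestamp(%s) / %d) * %d", column, seconds, seconds), nil
	case "sqlite":
		// Integer division in SQLite already truncates, so no floor() is needed.
		return fmt.Sprintf("(cast(strftime('%%s', %s) as integer) / %d) * %d", column, seconds, seconds), nil
	default:
		return "", fmt.Errorf("unsupported dialect %q", dialect)
	}
}

func main() {
	for _, d := range []string{"postgres", "mysql", "sqlite"} {
		expr, _ := bucketExpr(d, "created_at", 900)
		fmt.Printf("SELECT %s AS bucket, COUNT(*) FROM logs GROUP BY bucket\n", expr)
	}
}
```

The column name must come from trusted code, not from request input, since it is spliced into the expression.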

== Package-layer additions ==

  pkg/nodes: GetByEnvPaged, NodeView projection, SortableColumns,
             platform-bucket helpers, GetOsqueryVersionCounts.
  pkg/queries: GetByEnvTargetPaged, GetSaved* CRUD, SortableColumns,
               sample-template loader, GetNodeQueryBucketed.
  pkg/carves: GetByEnvPaged, sample-template loader,
              GetNodeCarveBucketed.
  pkg/environments: Create / Update / Delete, UpdateConfig /
                    UpdateIntervals / UpdateExpiration helpers.
  pkg/auditlog: GetPaged with PageFilter; FailedLogin / FailedEnroll
                hooks; GetEnvActivityBucketed for the heatmap.
  pkg/logging: GetNodeLogs with ?q= search filter,
               GetNode{Status,Result}Bucketed for the heatmap.
  pkg/osquery: LoadTables (osquery schema for the SPA query editor).
  pkg/types: NodeView, paginated response envelopes, EnvCreate /
             EnvUpdate / EnvConfig* request types, SettingPatchRequest,
             SavedQueryView, AdminUserView.

Verified: go build ./... clean, go vet ./... clean, go test ./... all
packages pass. End-to-end tested against a Kali docker deployment.

== What this depends on ==

This PR is stacked on the security-hardening PR (auth bedrock, env
secret containment, TLS-side rate-limit). When that PR is merged
upstream, this branch will be re-targeted at the new main HEAD.

== What this enables ==

A separate round-3 PR will land the React admin SPA under a new
`frontend/` directory at the repo root. The SPA consumes only the
endpoints in this PR — no admin-template surface is touched.
Round 3 of 3. Lands the React + TypeScript + Vite SPA under a new
`frontend/` directory at the repo root. The SPA is fully separable from
the legacy `osctrl-admin` templates — both can run side-by-side during a
migration window, and the legacy admin is not touched by this PR.

== Tech stack ==

- React 19 + TypeScript 5 (strict)
- Vite 7 (build), @tailwindcss/vite (styling), Tailwind CSS v4
- TanStack Router (typed file-based routing)
- TanStack Query 5 (server state, polling + cache)
- TanStack Table 8 (headless tables)
- react-hook-form 7 + zod 3 (forms + validation)
- Radix UI primitives (à la carte, unstyled)
- lucide-react (icons; tree-shaken, no emoji)
- Monaco editor (lazy-loaded for the osquery / config editor)
- Vitest + @testing-library/react + jsdom (component tests)

Bundle: ~780KB JS / ~52KB CSS pre-compression; ~214KB JS + ~9KB CSS
after gzip. Monaco is code-split into its own chunk so the initial
load doesn't pay the editor cost on pages that don't need it.

== Pages (covering parity with the legacy admin) ==

- Login (env picker + creds, pre-auth env list)
- Dashboard (cross-env KPIs, per-env tile, agent-version panel,
  active-queries progress, recently-seen nodes, failed-enroll watch)
- Nodes table (paginated, sortable, searchable; quick-filters; 4×24h
  activity heatmap per row)
- Node detail (system info, status logs, result logs, distributed
  queries, carves, activity tab with interval picker)
- Queries list + run form (target selector, Monaco SQL editor with
  osquery-table autocomplete, expHours)
- Query detail (paginated virtual-scroll results, CSV export,
  search-from-result-cell → SQL-template)
- Saved queries (CRUD)
- Carves list, run form, detail (archive download)
- Tags (env-scoped + global)
- Users (list, permissions modal, token modal)
- Profile (display name, password change, token refresh)
- Environments (list, create, edit) + Monaco-based env config editor
  (options / schedule / packs / decorators / ATC) with DiffView
- Enroll page (per-OS one-liners + downloads)
- Audit log (paginated, filtered)
- Settings (per-service, typed inputs)

== Design system ==

- Custom osctrl tokens (dark default, full light parity, signal-teal
  accent #2bc4be / #0a8a85, semantic status colors paired with icons
  rather than conveyed by color alone).
- Density modes (comfortable / compact / dense) via CSS custom
  properties.
- Tabular nums, Inter + Space Grotesk + IBM Plex Mono.
- Restrained motion (120–220ms transitions, reduced-motion honored).
- Single-accent rule: one signal-teal element active per screen.

== Routing ==

TanStack Router with a file-based tree under `frontend/src/routes/`.
The `_app` segment is the authenticated shell that wraps every page
behind the AppShell (top bar + side nav + env switcher). Login at
`/login/$env` is outside `_app`.

== Auth ==

- HttpOnly cookie session (`osctrl_token`) set by the API on login.
- Double-submit CSRF (`osctrl_csrf` cookie + `X-CSRF-Token` header)
  managed via a thin in-memory token store + request interceptor.
- 401 from any endpoint redirects to `/login/$env?next=...`.

== Deployment ==

Three patterns, in `deploy/`:

1. nginx (recommended): `deploy/nginx/frontend.conf.example` shows
   the production pattern (root + try_files for the SPA, /api/* to
   osctrl-api, baseline security headers, immutable cache for hashed
   assets, no-cache for index.html).
2. Docker (`deploy/docker/dockerfiles/Dockerfile-osctrl-frontend`):
   multi-stage (node:20 → nginx:alpine), single image with the SPA
   pre-built + nginx pre-configured.
3. Static hosting + CDN: ship `frontend/dist/` to S3/CloudFront/etc.,
   configure CORS on osctrl-api.

The dev compose stack adds an `osctrl-frontend` service that builds
the same multi-stage image and serves on :8088 alongside the legacy
admin on :8443 — operators can compare side-by-side on the same data.

== Make targets ==

- `make frontend-install` — npm ci
- `make frontend-dev` — Vite dev server on :5173 (proxies /api → :8081)
- `make frontend-test` — vitest + tsc
- `make frontend-build` — produces frontend/dist/
- `make frontend` — install + build (CI / Docker shorthand)

== CI ==

`.github/workflows/frontend-build.yml`:
- Pinned action SHAs (matches the existing osctrl convention)
- typecheck + tests + build
- forbid `dangerouslySetInnerHTML` (CI gate — every node-originating
  field must be JSX-escaped; future contributors get a build break
  instead of silent XSS regression)
- Uploads dist/ as a build artifact

== Test plan ==

- [x] `npx tsc --noEmit` — clean
- [x] `npx vitest run` — 19 files, 92 tests pass
- [x] `npm run build` — produces frontend/dist/ cleanly
- [x] Backend untouched: `go build ./...`, `go vet ./...`, all 14
  Go packages' tests still pass
- [x] End-to-end smoke against a Kali docker deployment

== What this depends on ==

Stacked on the previous two PRs:
- Security hardening (auth bedrock, CSRF, env secret containment,
  TLS rate-limit)
- API extensions (paginated lists, stats, saved-queries CRUD,
  user/permissions/tokens, env config PATCHes, audit-log filters)

When those merge, this branch will be re-targeted at the new
main HEAD with no conflicts.
@alvarofraguas alvarofraguas force-pushed the pr/round-3-frontend branch from 2989a56 to a02a826 Compare May 14, 2026 17:39

Labels

✨ enhancement New feature or request ⭐️ frontend Frontend related issues
